MAXSIM: An Automatic Metric for Machine Translation Evaluation Based on Maximum Similarity
Abstract
This paper describes our participation in the NIST 2008 MetricsMATR Challenge, using our recently proposed automatic machine translation evaluation metric MAXSIM. The metric calculates a similarity score between a pair of English system-reference sentences by comparing information items such as n-grams across the sentence pair. Unlike most metrics, MAXSIM computes a similarity score between items and then finds a maximum weight matching between the items such that each item in one sentence is mapped to at most one item in the other sentence. Evaluation on the WMT07, WMT08, and MT06 datasets shows that MAXSIM achieves good correlations with human judgment.
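To make the matching step concrete, here is a minimal Python sketch, not the authors' implementation: it treats unigrams as the items, uses a toy exact-match similarity (the paper's scorer is richer, using information beyond surface forms), and relies on scipy's Hungarian-algorithm solver to find the maximum weight one-to-one matching.

    # Minimal sketch of MAXSIM's matching step (not the authors' code):
    # score each candidate item pair, then find a maximum weight matching
    # in which each item is used at most once.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def match_items(sys_items, ref_items, sim):
        """Return total weight of a maximum weight one-to-one matching."""
        # Build the similarity matrix between system and reference items.
        w = np.array([[sim(s, r) for r in ref_items] for s in sys_items])
        # Pad to a square matrix so unmatched items pair with zero-weight slots.
        n = max(len(sys_items), len(ref_items))
        padded = np.zeros((n, n))
        padded[:w.shape[0], :w.shape[1]] = w
        # linear_sum_assignment minimizes cost, so negate to maximize weight.
        rows, cols = linear_sum_assignment(-padded)
        return padded[rows, cols].sum()

    # Toy similarity: exact unigram match only (illustrative assumption).
    exact = lambda a, b: 1.0 if a == b else 0.0

    sys_sent = "the cat sat on the mat".split()
    ref_sent = "a cat sat on a mat".split()
    print(match_items(sys_sent, ref_sent, exact))  # 4.0 matched unigrams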
Similar Papers
MAXSIM: A Maximum Similarity Metric for Machine Translation Evaluation
We propose an automatic machine translation (MT) evaluation metric that calculates a similarity score (based on precision and recall) of a pair of sentences. Unlike most metrics, we compute a similarity score between items across the two sentences. We then find a maximum weight matching between the items such that each item in one sentence is mapped to at most one item in the other sentence. Th...
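The precision/recall combination mentioned above can be sketched as follows; normalizing the matched weight by the item counts on each side and combining the two with an F-measure is an illustrative assumption, not necessarily the paper's exact formula.

    # Sketch of turning a matched weight into precision/recall/F
    # (the paper's exact normalization may differ; this is illustrative).
    def f_score(matched_weight, n_sys_items, n_ref_items, beta=1.0):
        precision = matched_weight / n_sys_items if n_sys_items else 0.0
        recall = matched_weight / n_ref_items if n_ref_items else 0.0
        if precision + recall == 0.0:
            return 0.0
        # Weighted harmonic mean of precision and recall.
        return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

    print(f_score(4.0, 6, 6))  # toy example from the sketch above: ~0.667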
Phrase-Based Evaluation for Machine Translation
This paper presents the use of chunk phrases to facilitate evaluation of machine translation. Since most current research on evaluation focuses on assessing translation quality in terms of content relevance and readability, we further introduce high-level abstract information, such as semantic similarity and topic models, into this phrase-based evaluation metric. The proposed metric ...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine translation evaluation metrics (MTEMs) are central to the development of machine translation (MT) engines, which rely on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from the lexical similarity family on machine tra...
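Validity studies of this kind typically correlate metric scores with human judgments over the same translations. A minimal sketch with hypothetical scores:

    # Minimal sketch of metric-vs-human validation: correlate metric scores
    # with human adequacy judgments over the same translations.
    # The score lists below are hypothetical, for illustration only.
    from scipy.stats import pearsonr, spearmanr

    metric_scores = [0.42, 0.55, 0.31, 0.67, 0.49]   # e.g. metric score per system
    human_scores  = [3.1, 3.8, 2.6, 4.2, 3.5]        # e.g. mean human adequacy

    r, _ = pearsonr(metric_scores, human_scores)      # linear correlation
    rho, _ = spearmanr(metric_scores, human_scores)   # rank correlation
    print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")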
TESLA: Translation Evaluation of Sentences with Linear-Programming-Based Analysis
We present TESLA-M and TESLA, two novel automatic machine translation evaluation metrics with state-of-the-art performance. TESLA-M builds on the success of METEOR and MaxSim, but employs a more expressive linear programming framework. TESLA further exploits parallel texts to build a shallow semantic representation. We evaluate both on the WMT 2009 shared evaluation task and show that they out...
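The one-to-one matching at the heart of such metrics can itself be posed as a linear program: maximize total similarity subject to every item being used at most once. The sketch below is a generic assignment LP solved with scipy, not TESLA's actual, richer formulation; because the constraint matrix of this LP is totally unimodular, its optimum is integral.

    # Sketch: one-to-one matching as a linear program (the LP relaxation
    # of the assignment problem). Not TESLA's exact model.
    import numpy as np
    from scipy.optimize import linprog

    def lp_match(weight):
        """Maximize sum(w_ij * x_ij) s.t. each row/column sums to <= 1."""
        m, n = weight.shape
        c = -weight.ravel()                     # linprog minimizes, so negate
        # Row constraints: sum_j x_ij <= 1; column constraints: sum_i x_ij <= 1.
        A = np.zeros((m + n, m * n))
        for i in range(m):
            A[i, i * n:(i + 1) * n] = 1.0
        for j in range(n):
            A[m + j, j::n] = 1.0
        b = np.ones(m + n)
        res = linprog(c, A_ub=A, b_ub=b, bounds=(0, 1), method="highs")
        return -res.fun, res.x.reshape(m, n)

    w = np.array([[0.9, 0.1], [0.4, 0.8]])
    total, x = lp_match(w)
    print(total)  # 1.7: the LP picks pairs (0,0) and (1,1)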
Syntactic Features for Evaluation of Machine Translation
Automatic evaluation of machine translation, based on computing n-gram similarity between system output and human reference translations, has revolutionized the development of MT systems. We explore the use of syntactic information, including constituent labels and head-modifier dependencies, in computing similarity between output and reference. Our results show that adding syntactic informatio...
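The n-gram similarity baseline referred to here can be sketched as clipped n-gram precision in the BLEU style; a minimal illustration, not the paper's exact formulation:

    # Minimal sketch of clipped n-gram precision, the kind of n-gram
    # similarity baseline that syntactic features are added on top of.
    from collections import Counter

    def ngram_precision(sys_tokens, ref_tokens, n=2):
        sys_ngrams = Counter(zip(*[sys_tokens[i:] for i in range(n)]))
        ref_ngrams = Counter(zip(*[ref_tokens[i:] for i in range(n)]))
        # Clip each system n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in sys_ngrams.items())
        total = max(sum(sys_ngrams.values()), 1)
        return overlap / total

    sys_sent = "the cat sat on the mat".split()
    ref_sent = "the cat is on the mat".split()
    print(ngram_precision(sys_sent, ref_sent, n=2))  # 0.6: 3 of 5 bigrams match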